A Semantic Approach to Kanji Lexicography

Author

  • Jack Halpern
Abstract

The Japanese script consists of two phonetic syllabaries, called hiragana (e.g. か /ka/) and katakana (e.g. カ /ka/), and thousands of Chinese characters, called kanji (e.g. 本 hon). Chinese characters have three basic properties: form, sound, and meaning. Many characters are of complex shape, some having more than twenty or even thirty strokes. Each character may be pronounced according to its Chinese-derived on reading, or to one of several native Japanese kun readings, and each reading may be associated with one or several (sometimes highly) polysemous words and/or morphemes. Moreover, because of the presence of numerous homophones, the Japanese script is highly complex and requires considerable effort to learn. Japanese has been the subject of various linguistic studies, but little attention has been given to the systematic analysis of its writing system. Kanji are combined with each other to generate countless compound words from a basic stock of a few thousand units. They function as a network of interrelated parts, not as a set of disconnected symbols. Though this is vaguely recognized by educators, it has been largely disregarded in the development of teaching programmes and the compilation of character dictionaries. The traditional approach places emphasis on the rote memorization of characters and compounds, while the meanings and functions of individual characters are often ignored. Although the number of Japanese language learners has more than quadrupled over the past decade, a character dictionary for non-Japanese users analysing the semantics of kanji has never been compiled. The few existing works are either out-of-date, contain many inaccuracies, or fail to show the meanings and functions of characters on the morphemic level. The lack of effective tools to overcome the difficulties posed by the Japanese script has been one of the principal reasons for the relatively small number of foreigners to have truly mastered the language.

Similar resources

A Character-based Approach to Distributional Semantic Models: Exploiting Kanji Characters for Constructing Japanese Word Vectors

Many Japanese words are made of kanji characters, which themselves represent meanings. However, traditional word-based distributional semantic models (DSMs) do not benefit from the useful semantic information of kanji characters. In this paper, we propose a method for exploiting the semantic information of kanji characters for constructing Japanese word vectors in DSMs. In the proposed method, t...

Evaluation of distributional semantic models: a holistic approach

We investigate how both model-related factors and application-related factors affect the accuracy of distributional semantic models (DSMs) in the context of specialized lexicography, and how these factors interact. This holistic approach to the evaluation of DSMs provides valuable guidelines for the use of these models and insight into the kind of semantic information they capture.

A Heuristic Algorithm for Nonlinear Lexicography Goal Programming with an Efficient Initial Solution

In this paper, a heuristic algorithm is proposed in order to solve a nonlinear lexicography goal programming (NLGP) by using an efficient initial point. Some numerical experiments showed that the search quality by the proposed heuristic in a multiple objectives problem depends on the initial point features, so in the proposed approach the initial point is retrieved by Data Envelopment Analysis...

Contribution of sublexical information to word meaning: An objective approach using latent semantic analysis and corpus analysis on predicates

Past studies have employed a subjective rating/categorization methodology to investigate whether radicals, an example of sub-lexical visual information in Chinese/kanji, contribute to computation of character/word meaning, with conflicting results. This study took an objective, corpus-based approach for the first time. Specifically, we conducted a Latent Semantic Analysis based on Japanese news...

Semantic involvement in the lexical and sentence processing of Japanese kanji.

This study examined how skilled Japanese readers activate semantic information when reading kanji compound words at both the lexical and sentence levels. Experiment 1 used a lexical decision task for two-kanji compound words and nonwords. When nonwords were composed of kanji that were semantically similar to the kanji of real words, reaction times were longer and error rates were higher than wh...

Publication date: 2008